Abstract

Background: Long non-coding RNAs (lncRNAs) are transcripts that are 200 bp or longer, do not encode proteins,
and potentially play important roles in eukaryotic gene regulation. However, the number, characteristics and
expression inheritance pattern of lncRNAs in maize are still largely unknown.

Results: By exploiting available public EST databases, maize whole genome sequence annotation and RNA-seq
datasets from 30 different experiments, we identified 20,163 putative lncRNAs. Of these lncRNAs, more than 90%
are predicted to be the precursors of small RNAs, while 1,704 are considered to be high-confidence lncRNAs.
High-confidence lncRNAs have an average transcript length of 463 bp and genes encoding them contain fewer exons
than annotated genes. By analyzing the expression pattern of these lncRNAs in 13 distinct tissues and 105 maize
recombinant inbred lines, we show that more than 50% of the high-confidence lncRNAs are expressed in a
tissue-specific manner, a result that is supported by epigenetic marks. Intriguingly, the inheritance of lncRNA
expression patterns in 105 recombinant inbred lines reveals apparent transgressive segregation, and maize lncRNAs
are less affected by cis- than by trans-genetic factors.

Conclusions: We integrate all available transcriptomic datasets to identify a comprehensive set of maize lncRNAs,
provide a unique annotation resource of the maize genome and a genome-wide characterization of maize lncRNAs,
and explore the genetic control of their expression using expression quantitative trait locus mapping.



Background 

While the central dogma defines the primary role for RNA as a messenger molecule in the process of
gene expression, there is ample evidence for additional functions of RNA molecules. These RNA
molecules include small nuclear RNAs (snRNAs), small nucleolar RNAs (snoRNAs; mainly tRNAs and
rRNAs), signal recognition particle (7SL/SRP) RNAs, microRNAs (miRNAs), small interfering RNAs
(siRNAs), piwi RNAs (piRNAs) and trans-acting siRNAs (ta-siRNAs), natural cis-acting siRNAs and
long noncoding RNAs (lncRNAs). lncRNAs have been arbitrarily defined as non-protein coding RNAs
more than 200 bp in length, distinguishing them from short noncoding RNAs such as miRNAs and
siRNAs [1,2]. Rather, lncRNAs have been reported to influence the expression of other genes [2].
Based on the anatomical properties of their gene loci, lncRNAs were further grouped into antisense
lncRNAs, intronic lncRNAs, overlapping lncRNAs that in part overlap protein-coding genes and
intergenic lncRNAs [2]. lncRNAs are usually expressed at low levels, lack conservation among
species and often exhibit tissue-specific/cell-specific expression patterns [3,4].

With the advent of genomic sequencing techniques, genome-wide scans for lncRNAs have been
conducted via cDNA/EST in silico mining [5,6], whole genome tiling array and RNA-seq approaches
[7,8] and epigenetic signature-based methods [9,10]. Thousands of lncRNAs have been identified in
a number of species. For example, approximately 10,000 human lncRNAs were uncovered by the ENCODE
Project [4]. The finding that several hundred human lncRNAs interact with chromatin remodeling
complexes suggests that they have functional significance [9]. Indeed, some lncRNAs have been
shown to influence human disease, plant development, and other biological processes [10-14].

Although less well characterized than mammalian lncRNAs, plant lncRNAs have defined functional
roles. Vernalization in Arabidopsis is influenced by lncRNAs COOLAIR (an antisense lncRNA) and
COLDAIR (an intronic lncRNA) [15,16]. INDUCED BY PHOSPHATE STARVATION1 is a member of the TPS1/Mt4
gene family that acts as a miR399 target mimic in fine tuning of PHO2 (encoding an E2 ubiquitin
conjugase-related enzyme) expression and phosphate uptake in Arabidopsis, tomato and Medicago
truncatula but does not encode a protein [17,18]. Enod40 was also identified as a lncRNA involved
in nodulation [19,20]. Genome-wide scans for lncRNAs have also been performed in Arabidopsis
thaliana [21-27], Medicago truncatula [28], Oryza sativa [29] and Zea mays [30]. In maize, an in
silico bioinformatic pipeline was used on a limited set of full-length cDNA sequences to identify
1,802 lncRNAs, of which 60% are likely to be precursors of small RNAs [30]. Each of the lncRNA
surveys in plants has uncovered a substantial number of lncRNAs, which are often expressed at low
levels in a tissue-specific manner as in humans and other mammals, and act as natural miRNA target
mimics, chromatin modifiers or molecular cargo for protein re-localization [1].

To identify a more comprehensive set of maize lncRNAs, we integrated the information from
available public ESTs, maize whole genome sequence annotation, and RNA-seq datasets from 30
different experiments and developmental stages in the maize reference genotype B73. In total,
1,704 high-confidence lncRNAs (HC-lncRNAs) and 18,459 pre-lncRNAs (which are likely to be
precursors of small RNAs) were identified in this analysis. The expression patterns and potential
regulatory roles of these lncRNAs were examined in 30 B73 experiments and at several
well-characterized loci. Finally, we explored the regulatory variation of lncRNAs in an RNA-seq
dataset of shoot apices from 105 genotypes of the maize intermated B73 × Mo17 recombinant inbred
line (IBM-RIL) population [31] to map the genetic factors underlying the expression variation of
lncRNAs. These expression quantitative trait locus (eQTL) mapping results enhance our
understanding of the inheritance of lncRNA expression in plants.

Results 

Genome-wide identification of lncRNAs in maize

We sought to identify a relatively comprehensive set of maize lncRNAs. To achieve this, it is
important to remove potential pseudogenes that have acquired nonsense or missense mutations as
well as non-coding RNA precursors that will give rise to known classes of RNAs such as tRNAs,
rRNAs, and snRNAs. A comprehensive set of transcripts for the reference genotype B73 was developed
by combining data from two sources: the maize working gene set transcripts [32]; and de novo
transcript assemblies from RNA-seq datasets from 30 different experiments (Figure 1A). There are
110,028 loci (136,774 transcript isoforms) in the working gene set (WGS) of the maize genome
annotation [33]. This set of genes consists of both computational predictions of genes and EST
collections from a variety of tissues. Many analyses in maize utilize the 39,656 genes in the
filtered gene set (FGS), a subset of the WGS that was selected based upon sequence similarity to
other species and the existence of putative full-length coding sequences [32]. However, the WGS
may include lncRNAs [30]. We also developed a set of transcript assemblies based upon 806 million
uniquely mapped reads from 30 different experiments of the reference genotype B73 [34-39]. These
sequences were used to perform de novo transcript assembly with Cufflinks [40] and resulted in
83,623 expressed loci with 98,444 transcript isoforms, of which 16,759 loci and 17,696 transcript
isoforms are not present in the WGS. The 110,028 loci (136,774 transcript isoforms) from the WGS
and 83,623 loci (98,444 transcript isoforms) from the de novo transcript assemblies were combined
to generate a non-redundant set of 126,787 transcribed loci (154,470 transcript isoforms)
(Figure 1B,C).

This comprehensive set of transcribed sequences from B73 was analyzed to identify putative
lncRNAs. There are 33,565 loci (38,967 transcript isoforms) that are at least 200 bp in length and
do not encode an ORF of more than 100 amino acids. These sequences were filtered by comparing with
the Swiss-Protein database to eliminate transcripts that contain sequence similarity (E-value
<0.001) to known protein domains. Further filtering was performed using the Coding Potential
Calculator [41], which assesses the quality, completeness and sequence similarity of potential
ORFs to proteins in the NCBI protein database. After applying these criteria, we identified 19,608
loci (20,163 transcript isoforms; Additional file 1) that encode transcripts of >200 bp but that
have little evidence for coding potential, and that were considered as putative lncRNAs. These
include 12,431 loci (12,647 isoforms) from the WGS and 7,177 loci (7,515 isoforms) from the de
novo transcript assemblies. This set of putative lncRNAs also includes 1,580 sequences previously
identified by Boerner and McGinnis [30].
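
The length and ORF criteria described above can be illustrated with a short Python sketch. This is
only an illustration under stated assumptions, not the pipeline used in the study: the function
names (longest_orf_aa, is_candidate_lncrna) are hypothetical, an ORF is assumed to run from an ATG
to the first in-frame stop codon (or to the end of the sequence) on either strand, and the
Swiss-Protein and Coding Potential Calculator filters are not reproduced.

```python
# Sketch of the first two filters: transcript length >= 200 bp and longest
# ORF <= 100 amino acids. Function names and the ORF definition are assumptions.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf_aa(seq: str) -> int:
    """Length in amino acids of the longest ATG-initiated ORF on either strand."""
    seq = seq.upper()
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None  # position of the ATG that opened the current ORF
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    best = max(best, (i - start) // 3)
                    start = None
            if start is not None:
                # ORF runs off the end of the transcript; count complete codons.
                best = max(best, (len(strand) - start) // 3)
    return best


def is_candidate_lncrna(seq: str, min_len: int = 200, max_orf_aa: int = 100) -> bool:
    """Apply the length and ORF-size thresholds quoted in the text."""
    return len(seq) >= min_len and longest_orf_aa(seq) <= max_orf_aa
```

A transcript that passes this check would still need the similarity-based filters described in the
text before being treated as a putative lncRNA.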

Figure 1 Informatics pipeline for the identification of maize lncRNAs. (A) Schematic diagram of the informatics pipeline. (B) The proportion
of WGS transcripts with/without EST support. (C) Venn diagram showing the numbers of transcripts detected by the WGS, RNA-seq assemblies
or by both assemblies. (D) The number of HC-lncRNAs and pre-lncRNAs derived from RNA-seq and WGS, respectively. (E) The proportion of
transcripts from the WGS and RNA-seq with sequence similarity to maize repetitive elements. DB, DataBase; EST, expressed sequence tag;
HC-lncRNAs, high confidence lncRNAs; ORF, open reading frame; WGS, working gene set.

These 20,163 putative lncRNAs may contain precursors to small RNA molecules, such as miRNAs, short
hairpin RNAs (shRNAs) and siRNAs [30]. The putative lncRNAs were compared to a comprehensive set
of small RNAs from different tissues and small RNA related mutants. More than 90% (18,459) of the
putative lncRNAs have sequence similarity with small RNAs and were classified as pre-lncRNAs
(Additional file 2; Materials and methods). A set of 1,704 lncRNAs that do not have sequence
similarity to known classes of non-coding RNAs were defined as HC-lncRNAs (Additional file 3).
These 1,704 HC-lncRNAs include 479 sequences from the WGS and 1,225 sequences from the de novo
transcript assemblies (Figure 1D). The HC-lncRNAs also contain 201 (35%) of the 572 HC-lncRNAs
previously identified by Boerner and McGinnis [30]. RT-PCR was used to validate the expression and
sequence for 24 lncRNAs (Figure 2). The 24 putative lncRNAs selected for validation include 18
that were present in the working gene set from the maize genome project [32] and 6 that are novel
transcripts from our assembly of RNA-seq data. RT-PCR was performed for root, leaf and shoot
tissue of 2-week-old B73 seedlings and the expected products were recovered for 23 of the 24
lncRNAs tested. In some cases, there was evidence for tissue-specific expression while many of the
lncRNAs were detected in all three tissues. These RT-PCR bands and specific expression were
largely consistent (90/96) with the RNA-seq data. For example, lncRNA (GRMZM2G549431_T01) was not
detected by either RNA-seq or RT-PCR in the leaf sample. Two of the lncRNAs (GRMZM2G010274_T01 and
GRMZM2G518002_T01) showed additional isoforms in some of the tissues that may reflect
tissue-specific splicing variants. RT-PCR products from 10 lncRNAs were sequenced and all 10
exhibited the appropriate sequence. We proceeded to analyze characteristics, diversity and
inheritance patterns of these maize HC-lncRNAs.

Figure 2 RT-PCR validation of putative lncRNAs in root, leaf and shoot of 2-week-old seedlings of
maize inbreds (B73 and Mo17). Twenty-four putative lncRNAs, including 19 HC-lncRNAs and 5
pre-lncRNAs, that exhibit expression in seedling tissue were selected for RT-PCR validation. Each
primer set was used to perform RT-PCR on four RNA samples, including (1) B73 root, (2) B73 leaf,
(3) B73 shoot, and (4) Mo17 shoot isolated from 2-week-old seedlings. Actin was used as a control
to show amplification of cDNA samples but no amplification of untreated RNA samples. The marker is
a 100 bp DNA ladder from Invitrogen.

Characterization of maize lncRNAs

A substantial number (74%) of the pre-lncRNAs have sequence similarity to repetitive sequences of
maize (Figure 1E). In contrast, the majority (98%) of the HC-lncRNAs do not contain maize
repetitive sequences. Taken together, over 68% (13,811) of 20,163 maize putative lncRNAs are
repetitive sequences (or transposons), which is similar to the proportion of lncRNAs in mammals
[42]. While the pericentromeric regions of most maize chromosomes have lower gene densities than
chromosome arms [32], maize lncRNAs are more evenly distributed across chromosomes (Figure 3A).
The HC-lncRNAs were characterized according to the locations relative to the nearest
protein-coding genes. The majority of lncRNAs (93%) are located in intergenic regions and only 7%
of the lncRNAs overlap with gene sequences. Among the intergenic HC-lncRNAs, 66 (3.9%) and 209
(12.3%) are located within 5 kb upstream and downstream of genes, respectively (Figure 3B). The
remaining 83.8% of intergenic lncRNAs are at least 5 kb away from the nearest gene. This
proportion (83.8%) is significantly (P = 8.1E-09) higher than the proportion of FGS genes located
at least 5 kb from other FGS genes (32.6%). The majority of the lncRNAs are relatively short with
very few (3%) greater than 1 kb in length (Figure 3C). Most (81%) of the lncRNAs consist of a
single exon (Figure 3D). While we could not directly distinguish the transcript orientation using
the non-strand-specific RNA-seq, transcript orientations could be determined using the intron
splicing 'GT-AG rule' for those HC-lncRNA genes that contain an intron. Of the 323 lncRNAs that
could be oriented based on the GT-AG intron splice sites, 23 (7%) consist of antisense
transcripts.
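
The positional classification described above (overlapping a gene, within 5 kb upstream or
downstream of a gene, or more distant) can be sketched as follows. This is a hedged illustration
rather than the procedure used in the study: the Gene tuple and classify_position function are
hypothetical names, coordinates are assumed to be 1-based and inclusive on a single chromosome,
and upstream/downstream are assumed to be defined relative to the strand of the nearest gene.

```python
from typing import Iterable, NamedTuple


class Gene(NamedTuple):
    """Hypothetical minimal gene record on one chromosome."""
    start: int
    end: int
    strand: str  # "+" or "-"


def classify_position(lnc_start: int, lnc_end: int,
                      genes: Iterable[Gene], window: int = 5000) -> str:
    """Label a lncRNA relative to its nearest gene on the same chromosome."""
    best_label, best_dist = "distal", None
    for gene in genes:
        # Any shared base counts as overlap.
        if lnc_start <= gene.end and gene.start <= lnc_end:
            return "overlapping"
        if lnc_end < gene.start:
            # lncRNA lies at lower coordinates than the gene.
            dist = gene.start - lnc_end
            side = "upstream" if gene.strand == "+" else "downstream"
        else:
            # lncRNA lies at higher coordinates than the gene.
            dist = lnc_start - gene.end
            side = "downstream" if gene.strand == "+" else "upstream"
        if best_dist is None or dist < best_dist:
            best_dist = dist
            best_label = side if dist <= window else "distal"
    return best_label
```

Counting the labels returned for every intergenic HC-lncRNA would give proportions comparable in
spirit to those reported in Figure 3B, subject to the assumptions listed above.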

The lncRNA sequences were compared with genomic sequences from Arabidopsis, rice and sorghum to
determine the portion of lncRNAs that had similarity (BLASTN E < 1.0E-10) to these species
(Figure 3E). As expected, the conservation of lncRNAs is substantially lower than that of protein
coding genes in comparisons with all three species. Permutations of random samplings of intergenic
or intronic DNA were used to assess whether lncRNAs exhibit the same levels of conservation for
these sequences among species. The lncRNAs have sequence similarity at the same rate as observed
for intergenic sequences in all three cross-species comparisons. The maize lncRNAs exhibit the
same level of conservation in Arabidopsis and rice as intronic sequences (P > 0.05) but they are
significantly less conserved in sorghum (P < 0.01) than are randomly selected repeat-masked
intronic sequences with similar length distribution to lncRNAs (Figure 3E).

The level of DNA methylation within and surrounding lncRNA genes was compared with that of the FGS
genes in the reference genotype B73 (Additional file 4) [43]. Similar levels of DNA methylation
are observed in regions 1 kb upstream and downstream of lncRNAs and FGS genes. For both the
lncRNAs and FGS genes the level of DNA methylation is reduced near the transcription start and
stop sites. FGS genes show substantial levels of gene body methylation in CG and CHG contexts
while the gene bodies of lncRNAs do not. Gene body methylation is often associated with genes with
moderate to high levels of constitutive expression [44] and the lack of gene body methylation for
lncRNAs may reflect lower or more variable expression for these genes. The CHH DNA methylation
level is quite low for both FGS and lncRNA sequences.

Figure 3 Characteristics of maize lncRNAs. (A) Distribution of lncRNAs along each chromosome. The abundance of HC-lncRNAs, pre-lncRNAs
and FGS genes in physical bins of 10 Mb for each chromosome (generated using Circos). (B) Proportion of HC-lncRNAs and pre-lncRNAs that
are located within 5 kb (upstream or downstream) or further than 5 kb from the nearest FGS gene. The proportion of FGS genes located within
close proximity to other FGS genes is used as a control. (C) Lengths of HC-lncRNAs, pre-lncRNAs and FGS transcripts. (D) Numbers of exons in
HC-lncRNAs, pre-lncRNAs and FGS. (E) Percentage of maize HC-lncRNAs and FGS transcripts that are conserved in the Arabidopsis, rice and
sorghum genomes compared with the sequence conservation of intergenic or intronic fragments among species. Sequences were
repetitive-sequence masked and aligned to the genomes of Arabidopsis, rice and sorghum with the significant cutoff E value <1.0E-10.

Variation in lncRNA expression among tissues

The tissue-specificity of lncRNA expression was explored using the RNA-seq data from 30 different
samples of B73 that represent 13 distinct tissue types. The Shannon entropy, which ranges from
zero for genes expressed in a single tissue to log2(Number of tissues) for genes expressed
uniformly in all tissues considered, was employed to measure the tissue-specificity of lncRNA
expression [45]. Many (54%) of the lncRNAs were only detected in one of the tissues (with at least
four RNA-seq reads detected) and 10% of the lncRNAs were detected in five or more tissues
(Figure 4A,B). In contrast, only 8% of FGS genes were detected in only one tissue and 74% of FGS
genes were detected in five or more tissues using the same expression criteria (Figure 4A,B).
Interestingly, the male reproductive tissues (immature tassel, anther, and pollen) and embryo sac
had more examples of lncRNA expression than other tissues (Figure 4B). An analysis of the maximum
expression level (reads per kilobase per million reads (RPKM)) for all 13 tissues provided
evidence that FGS genes tend to have higher expression than lncRNAs (Figure 4C). However, 20% of
the lncRNAs had an expression of >5 RPKM in at least one tissue, indicating that many of these
sequences do show at least moderate expression levels in some tissues. In any one tissue, a higher
proportion of FGS genes were expressed relative to HC-lncRNAs and expressed FGS genes had
significantly higher expression levels than expressed HC-lncRNAs. This tissue-specific expression
for many of the lncRNAs suggests that the expression of these sequences is biologically controlled
rather than simply reflecting 'transcriptional noise'.
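
The Shannon entropy measure of tissue specificity described above can be written out as a small
Python sketch. It assumes the usual definition, in which the expression values of one transcript
across tissues are normalized to proportions p_t and the entropy is the sum of -p_t * log2(p_t);
the function name shannon_entropy is hypothetical and any preprocessing applied in the study is
not reproduced.

```python
import math
from typing import Sequence


def shannon_entropy(expression: Sequence[float]) -> float:
    """Shannon entropy (bits) of one transcript's expression across tissues.

    Returns 0 for a transcript expressed in a single tissue and
    log2(number of tissues) for perfectly uniform expression.
    """
    total = sum(expression)
    if total <= 0:
        raise ValueError("transcript is not expressed in any tissue")
    entropy = 0.0
    for value in expression:
        if value > 0:  # tissues with zero expression contribute nothing
            p = value / total
            entropy -= p * math.log2(p)
    return entropy
```

For 13 tissues the upper bound is log2(13), roughly 3.7 bits, so low values indicate
tissue-specific transcripts and values near 3.7 indicate broadly expressed ones.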

Figure 4 Tissue-specific expression and expression levels of lncRNAs. (A) Density plot of Shannon entropy of pre- and HC-lncRNAs, and FGS
transcripts. The Shannon entropy has units of bits and ranges from zero for genes expressed in a single tissue to log2(Number of tissues) for
genes expressed uniformly in all tissues considered. (B) Hierarchical clustering (Ward's method) of expression for the HC-lncRNAs and FGS genes
that were expressed in at least one tissue suggests that tissue-specific expression for lncRNAs is more common than that of FGS genes. Per-gene
normalization was applied to allow for visualization of relative expression in different tissues for all genes. Red indicates high expression level, blue
low expression, and black intermediate expression. SAM, shoot apical meristem. (C) Density plot of maximum expression levels of pre-lncRNAs
(green), HC-lncRNAs (red), and FGS (blue) across 13 distinct tissues of B73.

H3K27me3 is a facultative heterochromatin mark that is often associated with tissue-specific
regulation of gene expression [46]. The levels of H3K27me3 (trimethylation of histone H3 lysine
27) for lncRNAs were assessed in five different tissues [46]. There are differences in the
relative abundance of H3K27me3 over lncRNAs in different tissues of maize (Figure S2A in
Additional file 5). The tissue with the lowest average level of H3K27me3, immature tassel, also
exhibits expression for more of the lncRNAs than the other tissues, suggesting that H3K27me3 may
be involved in regulating tissue-specific expression for lncRNAs. To assess the correlation
between expression and H3K27me3 for the lncRNAs, H3K27me3 levels were contrasted for the lncRNAs
that are expressed or silent in each of the tissues for which H3K27me3 profiles were available for
analysis (Figure S2B in Additional file 5). In each tissue, genes were classified as not expressed
(RPKM = 0) or expressed (RPKM >1). In general, lncRNAs that are expressed tend to have lower
levels (P < 0.001) of H3K27me3 while the lncRNAs silenced in any one tissue often have elevated
H3K27me3 (Figure S2B in Additional file 5). The presence of H3K27me3 at silenced lncRNAs provides
evidence for targeted regulation of the expression of these lncRNAs similar to what is observed at
maize genes.
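
The RPKM-based expression classes used above can be made concrete with a brief sketch. It uses the
standard RPKM definition (reads per kilobase of transcript per million mapped reads); the function
names and the "unclassified" label for transcripts with 0 < RPKM <= 1 are assumptions made for
illustration, since the text only defines the two classes that were contrasted.

```python
def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def expression_class(value: float) -> str:
    """Classify a transcript in one tissue using the thresholds quoted in the text."""
    if value == 0:
        return "not expressed"
    if value > 1:
        return "expressed"
    # Values in (0, 1] fall in neither contrasted class (label is an assumption).
    return "unclassified"
```

For example, 50 reads on a 500 bp transcript in a library of 20 million mapped reads gives an RPKM
of 5, which would be classed as expressed.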

HC-lncRNAs inheritance pattern in the maize IBM-RIL 
population 

The expression levels of HC-lncRNAs in shoot apices of 105 maize IBM-RILs [31] were compared with
the expression levels in the parental lines for the 141 HC-lncRNAs that have detectable expression
(at least 4 reads/RIL) in more than 40% of the RILs. The expression patterns of these 141
HC-lncRNAs were compared with those of genes in the FGS. The analysis of expression levels in
shoot apices of 105 IBM RILs provides evidence for higher levels of transgressive variation in
expression levels of HC-lncRNAs than in FGS genes. The difference in expression for the RILs
relative to B73 or Mo17 was compared by calculating
$(\mathrm{Exp}_{\mathrm{parents}} - \mu_{\mathrm{progeny}})/\sigma_{\mathrm{progeny}}$, which is
expected to be centered at zero if the RILs generally have expression levels similar to the
parents. In general, the HC-lncRNAs tend to be expressed in the RILs at levels similar to their
parents but they have larger variation relative to the parents than observed for FGS genes
(Figure 5A,B). This larger variation for HC-lncRNAs than FGS genes may reflect the fact that most
HC-lncRNAs have quite low expression levels. However, a targeted analysis of HC-lncRNAs and FGS
genes with high expression levels (RPKM >10) revealed that even highly expressed HC-lncRNAs have
larger expression variation than FGS genes (Figure 5C,D). The deviation of expression levels from
that of the two parents was calculated as a vector (Figure 5E) and shows evidence for higher
deviation for HC-lncRNAs than for the FGS genes (P = 2.15E-20) (Figure 5F). This difference
between HC-lncRNAs and FGS genes is observed for highly expressed transcripts but is not detected
in transcripts with differential expression between the parents.
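
The standardized parental deviation used in this section, (parental expression - progeny mean) /
progeny standard deviation, can be computed as in the sketch below. The function name
parental_deviation is hypothetical, and the use of the sample standard deviation is an assumption;
the text does not state which estimator was used.

```python
import statistics
from typing import Sequence


def parental_deviation(parent_expression: float,
                       progeny_expression: Sequence[float]) -> float:
    """Standardized distance of a parent's expression from the RIL mean.

    Values near zero mean the parent sits in the middle of the progeny
    distribution; large absolute values mean most RILs differ from it.
    """
    mu = statistics.mean(progeny_expression)
    sigma = statistics.stdev(progeny_expression)  # sample SD (assumption)
    if sigma == 0:
        raise ValueError("progeny expression has no variance")
    return (parent_expression - mu) / sigma
```

Computing this value once for B73 and once for Mo17 for each transcript gives a pair of
coordinates of the kind summarized in the two-dimensional density plots of Figure 5.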

Figure 5 Inheritance pattern of lncRNAs and FGS genes in 105 maize IBM RILs. (A-D) Two-dimensional kernel density estimation of gene

Genetic dissection of expression-level variation of
HC-lncRNAs by eQTL mapping

The expression data from the 105 IBM RILs was used to map the regulatory regions of HC-lncRNA
expression. eQTL mapping was conducted for 74 HC-lncRNAs detected in at least 80% of maize RILs
using the expression levels in the 105 RILs as expression traits and a set of 7,865 high quality
SNP markers [31]. A total of 72 eQTLs (α = 0.05) with a threshold logarithm of odds (LOD) >4.17
were identified for 49 HC-lncRNAs. The 72 eQTLs include 21 (29%) cis-eQTLs and 51 (71%)
trans-eQTLs (Figure 6A; Additional file 6); the proportion of trans- versus cis-eQTLs is slightly
higher (P = 3.21E-03) than that observed for FGS genes [31]. Each HC-lncRNA or FGS gene was
classified according to whether a higher proportion of expression variation was explained by cis-
or trans-eQTL (Figure 6B). The HC-lncRNAs were more likely to have a major trans-acting eQTL than
the FGS genes (Figure 6B). Previous eQTL studies in animals and plants have revealed that many
loci influenced by multiple trans-eQTL have quite high levels of expression variation in
segregating offspring, presumably due to the potential for segregation of multiple eQTL with
different directional effects that result in transgressive segregation [47]. This could explain
why we observe higher levels of transgressive variation for lncRNAs as they are enriched for
regulation by trans-eQTL relative to FGS genes (Figure 6A). The increased contribution of
trans-acting regulation to expression variation for HC-lncRNAs is consistent with the observation
of higher levels of transgressive segregation for HC-lncRNA expression relative to FGS gene
expression.
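
To make the LOD threshold above more concrete, the sketch below computes a LOD score for a single
marker by comparing a model with one mean per genotype class against a model with a single overall
mean, using the textbook relation LOD = (n/2) * log10(RSS_null / RSS_marker). This is a simplified
single-marker regression assumed here for illustration only; it is not claimed to be the mapping
procedure used in the study, and the function name single_marker_lod is hypothetical.

```python
import math
from typing import Hashable, Sequence


def single_marker_lod(genotypes: Sequence[Hashable],
                      expression: Sequence[float]) -> float:
    """LOD score for association between one marker and one expression trait."""
    n = len(expression)
    grand_mean = sum(expression) / n
    # Residual sum of squares with no marker effect.
    rss_null = sum((y - grand_mean) ** 2 for y in expression)

    # Group expression values by genotype class (for RILs, two parental alleles).
    groups: dict = {}
    for genotype, y in zip(genotypes, expression):
        groups.setdefault(genotype, []).append(y)

    # Residual sum of squares after fitting one mean per genotype class.
    rss_marker = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        rss_marker += sum((y - group_mean) ** 2 for y in values)

    if rss_marker == 0.0:
        return math.inf if rss_null > 0 else 0.0
    return (n / 2) * math.log10(rss_null / rss_marker)
```

Under this simplified model, a marker would be reported as an eQTL for a transcript when its LOD
exceeds the chosen threshold, which in the text is 4.17.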

We also dissected the genetic factors underlying the expression variation of 67 HC-lncRNAs, which
were expressed in more than 40% but less than 80% of the RILs, as these may represent HC-lncRNAs
that are expressed from one haplotype but not the other. The eQTL mapping for these 67 HC-lncRNAs
identified 72 eQTLs that influenced expression of 51 of these HC-lncRNAs (Additional file 7).
These HC-lncRNAs are enriched for having predominant effects of cis-eQTLs (72.5%) compared with
the HC-lncRNAs that are expressed in over 80% of the RILs (40.8%).

Furthermore, 460 HC-lncRNAs were expressed (with at least 4 RNA-seq reads detected) in less than
40% of the RILs. Most (80%) of these HC-lncRNAs were expressed at very low levels (the population
mean is less than 5 RPKM), while only 94 HC-lncRNAs were detected with moderate expression levels
(Additional file 8). Of these moderately expressed HC-lncRNAs, only six were detected in more than
10% but less than 40% of the 105 RILs. In total, 88 out of 94 moderately expressed HC-lncRNAs were
detected in only one of the 105 RILs. Taken together, these results indicate that complex
regulatory mechanisms may underlie HC-lncRNA expression variation.
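
The grouping of HC-lncRNAs by how often they are detected across the RILs (at least 4 reads per
RIL) can be sketched as follows. The function names and bin labels are hypothetical, and the
handling of a transcript detected in exactly 40% of the RILs is an assumption, because the text
describes the groups as "more than 40%" and "less than 40%".

```python
from typing import Sequence


def detection_fraction(read_counts: Sequence[int], min_reads: int = 4) -> float:
    """Fraction of RILs in which a transcript has at least `min_reads` reads."""
    detected = sum(1 for count in read_counts if count >= min_reads)
    return detected / len(read_counts)


def detection_bin(read_counts: Sequence[int], min_reads: int = 4) -> str:
    """Assign a transcript to one of the three detection groups used in the text."""
    fraction = detection_fraction(read_counts, min_reads)
    if fraction >= 0.8:
        return "at least 80% of RILs"
    if fraction > 0.4:
        return "40-80% of RILs"
    # Exactly 40% is placed here by assumption.
    return "at most 40% of RILs"
```

Applied to a read-count matrix with one row per HC-lncRNA and one column per RIL, this yields the
three groups that were analyzed separately above.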

Potential functional roles for maize lncRNAs

There are relatively few functionally characterized lncRNAs in maize. A careful analysis of the
regulation of the B1 locus in maize identified a region located more than 100 kb upstream of the
coding sequence [48] that is important for regulation and paramutation of B1 expression. There is
evidence for expression of a HC-lncRNA from this region [49] that may play a role in paramutation
[50,51]. Similarly, we identified a HC-lncRNA (GRMZM2G580571_T01) in the regulatory region of B1,
which was previously identified as required for B' paramutation (Figure 7A). There are several
other examples of maize genes with long-distance regulatory elements. The map-based cloning of a
major flowering time QTL, Vegetative to generative transition1 (Vgt1), identified a conserved
non-coding region located 70 kb upstream of the ZmRap2 (GRMZM2G700665) gene, which can influence
flowering time [52]. We found a HC-lncRNA (TCONS_0008948S) that is expressed from the Vgt1
regulatory region (Figure 7B). This lncRNA is detected in embryo sac and ovule tissues where
ZmRap2 is not detected, while ZmRap2 is expressed in other tissues where this HC-lncRNA is not
detected, suggesting the potential for antagonistic expression of this HC-lncRNA and the nearby
coding sequence. The cloning of a major domestication QTL in maize identified the teosinte
branched1 (tb1) gene [53]. Further analyses provided evidence for the importance of a distant
enhancer located approximately 40 kb upstream of the coding sequence [54] that may be influenced
by a transposon insertion [55]. We also identified a pre-lncRNA (TCONS_00010027) derived from this
genomic region in our study. This pre-lncRNA (TCONS_00010027) has sequence similarity with small
RNAs and thus may be processed into smaller fragments that function as small RNAs. Because lncRNAs
showed strong tissue-specific expression patterns and relatively low expression levels, and none
of these three lncRNAs were detected in the tissue used for the eQTL analysis with 105 maize RILs,
we could not conduct eQTL mapping for these lncRNAs. The finding that lncRNAs were detected from
distant regulatory regions in all three of these examples suggests that a number of the distant
regulatory regions for these maize genes may potentially involve lncRNAs. The shoot apical
meristem (SAM), from which all aboveground tissues of plants are derived, is critical to plant
morphology and development [56]. While SAM initiation and development is characterized by distinct
transcriptional variation [57], we also identified a subset of putative lncRNAs exhibiting
distinct expression variation during different stages of SAM ontogeny (Additional file 9). Further
research will be necessary to elucidate the functional roles of maize lncRNAs.

Discussion 

The advent of high-resolution tiling arrays, the emergence of new technologies in the field of
RNA-seq and large-scale chromatin immunoprecipitation experiments followed by next generation
sequencing (ChIP-Seq), as well as cDNA-library sequencing and serial analysis of gene expression
(SAGE), have allowed the research community to quantitatively discriminate most of the cellular
transcripts [58]. Each technical advance in examining the eukaryotic transcriptome has revealed
the increasing complexity of eukaryotic genome expression [59]. One such complexity is the
existence of non-protein coding genes, including short non-protein coding genes (such as small
interfering RNAs and miRNAs) and long non-protein coding genes. The short noncoding RNAs are
relatively well characterized and their importance in transcriptional and posttranscriptional
regulation of expression of other genes is well understood [60]. In contrast, lncRNAs have not
been as comprehensively identified or studied in many plant species.

Our analysis generated a relatively robust list of potential lncRNAs for maize. This set of
lncRNAs will likely be useful for functional genomics research or the analysis of potential
functional differences among maize varieties. The lncRNAs detected in this analysis were
identified from analysis of RNA-seq data from a diverse set of tissues and the current WGS
annotation. In total, more than 20,000 putative lncRNAs, including 1,704 HC-lncRNAs and 18,459
pre-lncRNAs, which are likely precursors of small RNAs, were identified. We have provided GTF
files as supplemental tables (Additional files 2 and 3) to enable the use and display of these
lncRNAs by other researchers. Our study not only sheds light on the features and expression
inheritance patterns of lncRNAs in maize, but also complements the reference genome annotation of
maize, which might further aid functional gene cloning and trigger more comprehensive studies on
gene regulation in plants.

Despite our use of >1 billion RNA-seq reads, it is worth noting that we only detected expression
for approximately 80% of the maize FGS and approximately 50% of the lncRNAs (the other half are
from WGS annotations). This may indicate that a number of additional lncRNAs with tissue- or
environment-specific expression have not been detected. We also applied an RPKM cutoff for
identifying expressed lncRNAs, and most lncRNAs were expressed at relatively low levels. While
caution is required when quantifying the expression levels of genes with low RNA-seq coverage
[61], focusing the analysis on lncRNAs with moderate expression may result in loss of lncRNAs with
low expression. There are several other potential limitations to our list of lncRNAs. Most of the
WGS, EST/cDNAs, and RNA-seq data were obtained after reverse transcription with polyA primers,
which selected for polyadenylated transcripts, and it is possible that some lncRNAs lack
polyadenylation. We have also employed relatively strict criteria by requiring that the putative
lncRNAs lack the ability to encode peptides of more than 100 amino acids or only have a weak
coding potential. However, there are examples of previously characterized lncRNAs from other
species that have the potential to encode peptides >100 amino acids, such as HOTAIR with 106 amino
acids [62], XIST with 136 amino acids [63] and KCNQ1OT1 with 289 amino acids [64]. These examples
are not thought to function as proteins but would not have met our relatively strict criteria for
definition as lncRNAs. Although we have identified more than 20,000 lncRNAs, it is likely that
additional maize lncRNAs exist and will be discovered through analysis of additional tissues and
genotypes or refinement of bioinformatics methods for characterizing lncRNAs.

Conclusions 

As previous studies have suggested [1-28], a substantial 
number of lncRNAs exist in mammals and plants, and play 
important functional roles in human disease, plant 
development, and other biological processes. In this study, 
we integrated available transcriptome datasets in maize to 
identify maize lncRNAs. More than 20,000 lncRNAs were 
uncovered in the maize reference genome B73, of which 1,704 
were considered HC-lncRNAs. These HC-lncRNAs showed 
methylation levels similar to those of protein-coding genes; 
however, they were more likely to exhibit tissue-specific 
expression patterns, which were also supported by epigenetic 
marks. eQTL mapping of the HC-lncRNAs showed that trans-eQTL 
contribute more than cis-eQTL to the expression-level 
variation of lncRNAs. Finally, we identified lncRNAs that 
were derived from regulatory regions controlling Tb1, Vgt1, 
and B1, which are key genes of developmental and agronomic 
importance in maize. We present the first comprehensive 
annotation of lncRNAs in maize, which opens the door for 
future functional genomics studies and regulatory expression 
research. Our findings constitute a valuable genomic resource 
for the identification of lncRNAs underlying plant 
development and agronomic traits. We also identified 
potential genetic mechanisms that control expression 
variation of lncRNAs in plant genomes. 

Materials and methods 

Datasets used for lncRNA identification 

Transcribed sequences from the maize reference inbred line 
B73 were collected from the Sequence Read Archive [65] and 
GenBank [66]. Data available in the Sequence Read Archive 
from the maize inbred line B73 included 30 RNA-seq 
experiments from 13 distinct tissues (leaf, immature ear, 
immature tassel, seed, endosperm, embryo, embryo sac, anther, 
ovule, pollen, silk, root, and shoot apical meristem) 
encompassing a total of 1.168 billion reads with read lengths 
ranging from 35 to 110 nucleotides (Additional file 10) 
[34-39]. The RNA-seq data were not derived from 
strand-specific sequencing. Hence, it was not possible to 
determine transcription orientation for transcripts that do 
not contain introns. Maize ESTs, including the full-length 
cDNAs used by Boerner and McGinnis [30], from a wide variety 
of tissues and stages were also collected from GenBank and 
integrated with the maize B73 genome annotation (AGP v2) 
(Additional file 10) [32]. 

Bioinformatic pipeline for identifying lncRNAs 

The different sequence datasets were merged into one 
non-redundant set of transcript isoforms in maize, 
which was subjected to a series of filters to eliminate po- 
tential protein-coding transcripts (Figure 1). 

For the RNA-seq data, all sequenced reads from each 
experiment were first aligned to the maize reference genome 
(AGP v2) using the spliced read aligner TopHat [33]. 
Following the two-iteration TopHat alignment method proposed 
by Cabili et al. [3], we then re-aligned each experiment 
using the pooled splice sites file to maximize the use of 
splice site information derived from all samples. The 
transcriptome of each experiment was assembled separately 
using Cufflinks [40]. To reduce transcriptional noise, only 
those assembled transcript isoforms that were detected in two 
or more experiments were retained for further analyses. Then, 
we compared the assembled transcript isoforms with the maize 
genome annotation WGS, which represents all transcript 
isoforms identified by the maize genome project [32]. The 
RNA-seq dataset enabled us to identify 17,696 transcript 
isoforms from 16,759 unknown genomic loci after filtering 
with the WGS. For the maize genome annotation-based 
transcripts [32], we combined maize ESTs and the WGS to 
eliminate transcripts from the WGS that were in silico 
annotated without expression evidence. The non-redundant 
transcripts supported by ESTs and/or RNA-seq were further 
filtered as follows (Figure 1). 

Size selection 

Putative lncRNAs were arbitrarily defined as transcripts 
that are >200 bp and have no or weak protein-coding ability 
[1-28]. We used in-house Perl scripts to first exclude 
transcripts smaller than 200 bp. 

Open reading frame filter 

More than 95% of protein-coding genes have ORFs of more than 
100 amino acids [67]. To remove transcripts with long ORFs, 
which are more likely to encode proteins, a Perl script was 
developed to retain as lncRNA candidates only those 
transcripts that encode ORFs of 100 or fewer amino acids or 
incomplete ORFs. 
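
The size and ORF filters can be illustrated with a minimal Python sketch. This is not the authors' Perl script; the function names, the scan of both strands (motivated by the non-strand-specific RNA-seq data), and the restriction to complete ATG-to-stop ORFs are assumptions made for illustration only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_complete_orf_aa(seq):
    """Length (in amino acids) of the longest ATG...stop ORF on either strand."""
    seq = seq.upper()
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None  # position of the first ATG of the currently open ORF
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    # Codons from ATG up to, but not including, the stop codon.
                    best = max(best, (i - start) // 3)
                    start = None
    return best


def passes_size_and_orf_filters(seq, min_len=200, max_orf_aa=100):
    """Keep transcripts of at least min_len bp whose longest complete ORF
    encodes max_orf_aa or fewer amino acids (hypothetical thresholds
    mirroring the text)."""
    return len(seq) >= min_len and longest_complete_orf_aa(seq) <= max_orf_aa
```

Because ORFs that never reach a stop codon are ignored, transcripts carrying only incomplete ORFs remain candidates, which matches the criterion stated above.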

Known protein domain filter 

Transcripts were aligned to the Swiss-Prot database 
to eliminate transcripts with potential protein-coding 
ability (cutoff E-value <0.001). 

Protein-coding-score test 

The Coding Potential Calculator [41], which is based on the 
detection of quality, completeness, and sequence similarity 
of the ORF to proteins in current protein databases, was 
utilized with default parameters to detect putative 
protein-encoding transcripts. Only transcripts that did not 
pass the protein-coding-score test were classified as 
lncRNAs. 

Elimination of housekeeping lncRNAs and precursors of 
small RNAs 

To rule out housekeeping lncRNAs (including tRNAs, snRNAs, 
and snoRNAs), putative lncRNAs were aligned to housekeeping 
lncRNA databases. The housekeeping lncRNA databases include 
the tRNA database downloaded from the Genomic tRNA Database 
[68]; the rRNA database from the TIGR Maize Database [69]; 
and the snRNAs, snoRNAs, and signal recognition particle 
(7SL/SRP) RNAs collected from NONCODE [70]. lncRNA candidates 
that have significant (P < 1.0E-10) alignment with 
housekeeping lncRNAs were not included in further analyses. 
Small RNAs in maize, which mainly consist of miRNAs, shRNAs 
and siRNAs, are generated from their precursors. The small 
RNA precursors are a special kind of lncRNA. To uncover this 
kind of lncRNA, we aligned putative lncRNAs with small RNA 
datasets [71] from multiple tissues, including leaf, ear, 
tassel, pollen, shoot and root, and different small 
RNA-related mutants, mop1 and rmr2 [72-74], using the same 
cutoff values used by Boerner and McGinnis [30]. Here, we 
treated the putative lncRNAs containing sequences homologous 
to small RNAs as likely precursors of small RNAs; however, 
some of them may indeed be genuine lncRNAs. Conversely, 
although HC-lncRNAs have no significant alignment with small 
RNAs, they may still be precursors of small RNAs, which could 
be expressed at such low levels that they could not be 
detected using current sequencing technology. Moreover, we 
also annotated lncRNAs using RepeatMasker [75] (repetitive 
database version 20130422 from [76]) with default parameters. 
For the classification of anatomical relationships between 
lncRNA loci and protein-coding genes, ZmB73 5b annotation 
[32] was employed to distinguish intergenic lncRNAs from 
neighboring protein-coding genes. The source code for the 
lncRNA identification pipeline was released in a GitHub 
repository [77]. 

The above protocol used to identify lncRNAs is similar to 
those of previous studies in mammals and plants [3-8,21-30]. 
However, we employed more stringent criteria than did Boerner 
and McGinnis [30]. We used ORF <100 amino acids as the 
cutoff, whereas Boerner and McGinnis [30] used <120 amino 
acids, and we applied double filters of protein-coding 
potential (the known protein domain filter and the 
protein-coding-score test). 

Validation of putative lncRNAs by RT-PCR 

To validate the putative lncRNAs we identified, we conducted 
RT-PCR of 24 putative lncRNAs in B73 and Mo17 tissues. We 
grew 10 plants of B73 and Mo17 and sampled the roots, leaves 
and shoot apices from 14-day-old seedlings. RNA from the 
roots, leaves and shoots of B73 and shoots of Mo17 was 
isolated and used for first-strand cDNA reverse transcription 
with the ImProm-II™ Reverse Transcription System (Promega, 
Madison, WI, USA). A total of 24 putative lncRNAs were 
randomly selected for validation and RT-PCR was conducted on 
the lncRNAs using routine PCR programs (Tm = 60°C) with 35 
amplification cycles. To control for genomic DNA 
contamination in our samples, the housekeeping gene Actin was 
used as an experimental control. All primer information can 
be found in Additional file 11. 

Sequence conservation of FGS, lncRNAs, intergenic and 
intronic fragments in Arabidopsis, rice and sorghum 

We employed 1,000 permutations of random sequences for the 
significance test of sequence conservation of the FGS, 
lncRNAs, and intergenic and intronic fragments as follows. 
First, we generated intronic and intergenic annotation files 
based on the maize WGS annotation [32] and all transcripts 
identified using the RNA-seq data in this study. Second, we 
randomly selected a specific number (the same as that of 
HC-lncRNAs) of intronic and intergenic genomic regions. 
Third, we adjusted the selected genomic regions according to 
the transcript length distribution of HC-lncRNAs. Fourth, we 
obtained the sequences of selected genomic regions based on 
the repeat-masked reference genome sequence. Fifth, we 
aligned the randomly selected and length-adjusted intronic 
and intergenic sequences against the whole genomes of 
Arabidopsis, rice and sorghum. Sixth, we summarized the 
proportion of selected genomic regions that aligned with the 
Arabidopsis, rice and sorghum genomes with a cutoff of 
E <1.0E-10. Seventh, we repeated steps 2 to 6 until the total 
number of permutations reached 1,000. 
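
The resampling logic of steps 2 to 7 can be sketched as follows. This is a simplified illustration, not the authors' code: it assumes each candidate region has already been scored as aligned (1) or not aligned (0) against a target genome, it omits the length adjustment of step 3, and the empirical P value formula is a common convention that the text does not specify. All function and parameter names are hypothetical.

```python
import random


def permutation_null(pool_hits, sample_size, n_perm=1000, seed=0):
    """Null distribution of the proportion of aligned regions.

    pool_hits: list of 0/1 flags, one per candidate intronic or intergenic
    region (1 = aligned to the target genome below the E-value cutoff).
    sample_size: number of regions drawn per permutation (the number of
    HC-lncRNAs in the text).
    """
    rng = random.Random(seed)
    return [
        sum(rng.sample(pool_hits, sample_size)) / sample_size
        for _ in range(n_perm)
    ]


def empirical_p(observed, null):
    """One-sided empirical P value with the usual +1 correction (assumed)."""
    exceed = sum(1 for value in null if value >= observed)
    return (exceed + 1) / (len(null) + 1)
```

The observed proportion of conserved HC-lncRNAs would then be compared against the list returned by `permutation_null` for each target genome.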

Expression, inheritance and genetic mapping of lncRNAs 

Two RNA-seq datasets were collected for the analyses of 
variation in lncRNA expression among tissues or among 
different genotypes: 1) RNA-seq data from 13 distinct 
tissues of the inbred line B73 from 30 experiments 
[34-39]; and 2) RNA-seq data from 2-week-old seedling 
shoot apices of 105 maize RILs [31]. 

The expression levels (RPKM) of all transcripts were 
quantified in the two RNA-seq datasets and normalized using 
Cufflinks v0.9.3 [40] based on the uniquely mapped reads of 
each sample. The FGS genes [32], most of which are conserved 
among species and more likely to be protein-coding genes, 
were used as controls for the analysis. Tissue specificity, 
measured by Shannon entropy [45], was compared between the 
expression profiles of lncRNAs and the FGS. As previously 
reported [43], bisulfite sequencing was conducted on the DNA 
extracted from the third seedling leaf of B73. The DNA 
methylation levels in CG, CHG and CHH contexts of B73 were 
calculated for genomic regions from which lncRNAs are 
transcribed and the FGS and their flanking 1 kb genomic 
regions [43]. H3K27me3 levels of lncRNAs and FGS were 
obtained from data reported by Makarevitch et al. [46]. For 
comparison of epigenetic levels, transcription start and stop 
sites, and upstream and downstream regions were classified 
based on ZmB73 5b annotations [32]. 
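
As an illustration of how Shannon entropy can summarize tissue specificity, the sketch below converts a transcript's expression values across tissues into relative abundances and computes the entropy in bits. It assumes the standard definition; the exact formulation in [45] may differ, and the function name is hypothetical.

```python
import math


def shannon_entropy(expression):
    """Shannon entropy (bits) of one transcript's expression across tissues.

    expression: non-negative expression values (for example RPKM), one per
    tissue. Low entropy means expression is concentrated in few tissues;
    the maximum, log2(number of tissues), means uniform expression.
    """
    total = sum(expression)
    if total <= 0:
        raise ValueError("transcript is not expressed in any tissue")
    entropy = 0.0
    for value in expression:
        if value > 0:  # zero-expression tissues contribute nothing
            p = value / total
            entropy -= p * math.log2(p)
    return entropy
```

With 13 tissues, a transcript expressed in a single tissue scores 0, whereas one expressed equally in all tissues scores log2(13), roughly 3.7 bits.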

In our previous study [31], RNA sequencing on an Illumina 
HiSeq 2000 with 103 to 110 cycles was conducted on the pooled 
RNA samples of 2-week-old seedling shoot apices from three 
replicates per genotype for 105 maize intermated B73 x Mo17 
recombinant inbred lines (IBM-RILs), which were derived from 
the cross of the inbred lines B73 and Mo17 [78]. Uniquely 
mapped reads were employed to quantify the expression levels 
of lncRNAs and the FGS [31]. lncRNAs and genes comprising the 
FGS, which were detected in the IBM-RILs, were extracted for 
expression inheritance pattern analysis and genetic mapping. 
To quantify the expression inheritance of transcripts in the 
RILs relative to B73 or 
Mo17, we used a statistic calculated by 
$(\mathrm{Exp}_{\mathrm{parents}} - \mu_{\mathrm{progeny}})/\sigma_{\mathrm{progeny}}$, 
where $\mathrm{Exp}_{\mathrm{parents}}$ shows the expression level 
in the two parents, $\mu_{\mathrm{progeny}}$ indicates the mean value of 
the expression level in the progeny population and $\sigma_{\mathrm{progeny}}$ 
represents the standard deviation of the expression level in 
the progeny population for a specific gene. Any specific 
transcript therefore has two values of the statistic, one per 
parent, which measure the deviation of the expression level 
from that of each parent (Figure 5E). The higher the value of 
the statistic, the more the transcript deviates in the 
progeny compared with the parents. This statistic is expected 
to be centered at zero if the RILs generally have expression 
levels similar to the parents. 
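
A minimal sketch of this statistic for a single transcript is given below. The function name is hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics


def parental_deviation(parent_expr, progeny_expr):
    """(Exp_parent - mean_progeny) / sd_progeny for one transcript.

    parent_expr: expression level of the transcript in one parent.
    progeny_expr: expression levels of the same transcript across the RILs.
    """
    mu = statistics.mean(progeny_expr)
    sigma = statistics.stdev(progeny_expr)  # sample SD; an assumption
    if sigma == 0:
        raise ValueError("no expression variation among the progeny")
    return (parent_expr - mu) / sigma
```

Calling the function once with the B73 value and once with the Mo17 value yields the two values per transcript described above.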

A high-resolution SNP genetic map of the IBM population based 
upon 7,856 high quality SNP markers from RNA-seq data was 
used to perform eQTL mapping for lncRNAs and FGS by using 
composite interval mapping. To obtain a global significance 
of 0.05 for the eQTL mapping, a permutation threshold was 
computed using 1,000 randomly selected e-traits x 1,000 
replicates. This threshold gave a likelihood ratio test value 
of 19.23, which corresponds to a LOD score of 4.17 as the 
significance cutoff of eQTL mapping. The confidence interval 
of each eQTL was selected based on the range of a 1.0 LOD 
drop on each side from the LOD peak point. If two adjacent 
peaks overlap in less than 10 cM, we considered them as one 
eQTL [31]. 
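
The correspondence between the likelihood ratio test value and the LOD score follows from the standard relation LOD = LRT / (2 ln 10), so 19.23 / 4.605 gives approximately 4.176, consistent with the reported 4.17. The sketch below shows this conversion together with a simple 1.0-LOD-drop interval on a scanned LOD profile; the function names and the discrete-grid treatment of the profile are assumptions for illustration, not the authors' implementation.

```python
import math


def lrt_to_lod(lrt):
    """Convert a likelihood ratio test statistic to a LOD score."""
    return lrt / (2 * math.log(10))


def lod_drop_interval(positions, lods, drop=1.0):
    """Return (left, peak, right) positions of the LOD-drop support interval.

    positions: map positions (cM) of the scanned points, in order.
    lods: LOD score at each position.
    The interval extends from the peak in both directions while the LOD
    stays within `drop` units of the peak value.
    """
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[peak], positions[right]
```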

Additional files 



Additional file 1: Table S1. Characteristics of all putative lncRNAs 
identified in this study. 

Additional file 2: Dataset S1. Annotation of pre-lncRNAs in the format 
of GTF. 

Additional file 3: Dataset S2. Annotation of HC-lncRNAs in the format 
of GTF. 

Additional file 4: Figure S1. Methylation levels of HC-lncRNAs and FGS 
genes. Percentage of DNA methylation in CG (black), CHG (red) and CHH 
(green) contexts is shown for HC-lncRNAs (solid lines) and FGS genes 
(dashed lines). Dashed vertical lines represent the presumed transcription 
start (left) and stop (right) for each lncRNA or gene with the length 
normalized to a value of 1,000. Regions to the left and right of the 
vertical dashed lines show DNA methylation levels in the 1,000 bp 
upstream of the presumed transcription start site (based upon ZmB73 5b 
annotations) or 1,000 bp downstream of the presumed transcription stop 
site, respectively. 

Additional file 5: Figure S2. H3K27me3 levels in maize HC-lncRNAs. (A) 
Variation in levels of H3K27me3 in HC-lncRNAs in different tissues of B73. 
The average level of H3K27me3 was plotted over the gene length (0 to 
1,000 represent the normalized length of each HC-lncRNA from presumed 
transcriptional start to presumed stop while the 1,000 bp upstream or 
downstream are actual lengths showing the level of H3K27me3 in surrounding 
regions) for five different tissues. (B) H3K27me3 levels of expressed and silent 
HC-lncRNAs in each of the five different tissues. In each tissue, the genes were 
classified as not expressed (FPKM = 0) or expressed (FPKM >1). 

Additional file 6: Table S2. eQTL mapping of HC-lncRNAs expressed in 
more than 80% of the RILs. a Chromosome position of e-traits. b Genetic 
position of e-traits. c The physical chromosomal location on the B73 
reference genome (AGPv2) of e-traits. d The middle physical position 
(equals the sum of the position of the transcription start site and the 
termination site divided by 2) of e-traits. e The genetic position of the 
peak of the eQTL. f The genetic position of the inferior support interval left 
bound of the eQTL. g The genetic position of the inferior support interval 
right bound of the eQTL. h The physical position of the peak of the eQTL 
on the B73 reference genome (AGPv2). i The logarithm of odds (LOD) 
score of the eQTL. j The additive effect - the positive value indicates that 
the allele from Mo17 increases the phenotypic value. k The amount of 
expression variation of the e-trait explained by the eQTL. Type shows the 
relationship between e-traits and the eQTLs. 

Additional file 7: Table S3. eQTL mapping of HC-lncRNAs expressed in 
more than 40% but less than 80% of the RILs. a Chromosome position of 
e-traits. b Genetic position of e-traits. c The physical chromosomal location 
on the B73 reference genome (AGPv2) of e-traits. d The middle physical 
position (equals the sum of the position of the transcription start site and 
the termination site divided by 2) of e-traits. e The genetic position of the 
peak of the eQTL. f The genetic position of the inferior support interval left 
bound of the eQTL. g The genetic position of the inferior support interval 
right bound of the eQTL. h The physical position of the peak of the eQTL 
on the B73 reference genome (AGPv2). i The logarithm of odds (LOD) 
score of the eQTL. j The additive effect - the positive value indicates that 
the allele from Mo17 increases the phenotypic value. k The amount of 
expression variation of the e-trait explained by the eQTL. Type shows the 
relationship between e-traits and the eQTLs. 

Additional file 8: Figure S3. The percent of RILs with expressed 
HC-lncRNAs and population mean of their expression levels in the RILs. 
The x-axis represents the percentage of RILs, while the y-axis indicates 
the population mean of RPKM. 

Additional file 9: Figure S4. LncRNA expression pattern across key 
stages in embryo development. The y-axis in each panel represents the 
scaled expression level among key stages (Pro, proembryo; Trans, transition 
stage; L1, L1 stage; L14, L14 stage; Col, coleoptile stage; and LM, lateral 
meristem). Each line indicates one gene (in grey) or lncRNA (in blue). The 
red line shows the mean expression levels in each panel. The title shows the 
name of the expression level cluster and the number (in brackets) of genes 
and lncRNAs in each cluster. 

Additional file 10: Table S4. Datasets used in this study. The 
preliminary RNA-seq analyses were conducted using TopHat [33] and 
Cufflinks [40] with the B73 reference genome AGPv2 [32]. 

Additional file 11: Table S5. Primer information used for lncRNA validation. 